Entropies Defined by Parsing Algorithms

Author

  • B. I. Mills
Abstract

Common deterministic measures of the information content of symbolic strings revolve around the resources used in describing or parsing the string. The well-known and successful Lempel-Ziv parsing process is described briefly and compared to the lesser-known Titchener parsing process, which might have certain theoretical advantages in the study of the nature of deterministic information in strings. Common to the two methods, we find that the maximal complexity is asymptotic to hn / log n, where h is a probabilistic entropy and n is the length of the string. By considering a generic parsing process that can be used to define string complexity, it is shown that this complexity bound appears as a consequence of the counting of unique words, rather than being a result specific to any particular parsing process.
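
As a rough illustration of the kind of parsing the abstract refers to, the sketch below implements an LZ78-style incremental parse in Python and counts the resulting phrases. The abstract does not say which Lempel-Ziv variant the paper uses, so the choice of LZ78, the function names `lz78_phrases` and `log_scale`, and the base-k logarithm (k being the alphabet size) are all assumptions made for this example, not the author's procedure.

```python
import math


def lz78_phrases(s):
    """Split s into LZ78-style phrases (an assumed variant, see lead-in).

    Each phrase is the shortest prefix of the remaining input that has
    not already appeared as a phrase.
    """
    seen = set()
    phrases = []
    current = ""
    for ch in s:
        current += ch
        if current not in seen:
            # First time this word appears: close the phrase here.
            seen.add(current)
            phrases.append(current)
            current = ""
    if current:
        # Leftover tail that repeats an earlier phrase.
        phrases.append(current)
    return phrases


def log_scale(n, alphabet_size=2):
    """Return n / log_k(n), the growth scale named in the abstract.

    The entropy factor h is omitted; using base k = alphabet_size for
    the logarithm is an assumption of this sketch.
    """
    return n / math.log(n, alphabet_size)


if __name__ == "__main__":
    text = "1011010100010"
    phrases = lz78_phrases(text)
    print(phrases)       # ['1', '0', '11', '01', '010', '00', '10']
    print(len(phrases))  # 7
    print(log_scale(len(text)))
```

Because every closed phrase must be a word not seen before, the number of phrases is limited by how many distinct short words exist over the alphabet. That is the counting argument the abstract credits for the hn / log n bound.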

Similar resources

Feature Engineering in Persian Dependency Parser

A dependency parser is one of the most important fundamental tools in natural language processing; it extracts the structure of sentences and determines the relations between words based on dependency grammar. Dependency parsing is well suited to free-word-order languages such as Persian. In this paper, a data-driven dependency parser has been developed with the help of a phrase-structure parser fo...

Relationship among Complexities of Individual Sequences over Countable Alphabet

This paper investigates some relations among four complexities of sequences over a countably infinite alphabet, and shows that two kinds of empirical entropies and the self-entropy regarding a finite state source are asymptotically equal and lower bounded by the maximum number of phrases in a distinct parsing of the sequence. Some connections with source coding theorems are also investigated. Furthe...

The Effect of Morphology on Persian Dependency Parsing

Data-driven systems can be adapted to different languages and domains easily. Applying this trend to dependency parsing has led to the introduction of data-driven approaches. The existence of appropriate corpora that contain sentences and their associated dependency trees is the only prerequisite for data-driven approaches. Despite obtaining highly accurate results for the dependency parsing task in the English langu...

Time-Dependent Solutions of Nonlinear Fokker-Planck Equations Related to Arbitrary Functions of Tsallis Entropy

The nonlinear Fokker-Planck equations can be related to generalized entropies. We investigate the stationary solutions of Fokker-Planck equations that are related to entropies defined as arbitrary functions of the Tsallis entropy. The transient solutions of the equations are also determined for linear drifts.

Generalised LR parsing algorithms

This thesis concerns the parsing of context-free grammars. A parser is a tool, defined for a specific grammar, that constructs a syntactic representation of an input string and determines whether the string is grammatically correct. An algorithm that is capable of parsing any context-free grammar is called a generalised (context-free) parser. This thesis is devoted to the theoretical analysis ...

Publication date: 2003